Including uncertainty of speech observations in robust speech recognition

Authors

  • José C. Segura
  • Ángel de la Torre
  • Javier Ramírez
  • Antonio J. Rubio
  • M. Carmen Benítez
Abstract

Noise compensation methods for speech recognition provide a cleaned version of the speech representation. Usually this cleaned version is the expected value of the speech parameters given the observed noisy speech and the noise statistics. A more realistic representation should include the probability distribution of the cleaned speech instead of only its expected value, in order to represent the uncertainty that the variability of the noise process introduces into the compensation. Recently, the inclusion of this uncertainty in the recognition process has been studied. Some approaches represent the uncertainty in the HMM parameter values; others represent it in the feature space. The second approach offers a much simpler system implementation and a lower computational cost. In this paper we develop a noise compensation technique that incorporates the variance of the cleaned speech into the speech representation. The variance is estimated using a Wiener filter during the speech feature enhancement process. Including the uncertainty in this way requires modifying the decoding rule. Experimental results on the AURORA 2 database show a sustained improvement in recognition performance (about 21% word error rate reduction) when uncertainty is considered in the decoding rule.
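
The abstract does not spell out the modified decoding rule. As a rough illustration only, the sketch below uses a form that is common in feature-space uncertainty decoding: each Gaussian's variance is inflated by the variance of the cleaned feature before the state likelihood is evaluated. The diagonal-covariance assumption, the function names (`log_gauss_diag`, `log_gauss_uncertain`), and the toy numbers are all hypothetical, and the Wiener-filter variance estimate is taken as a given input rather than computed.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gauss_uncertain(x_hat, x_var, mean, var):
    """Uncertainty-aware score (assumed form, not taken from the paper).

    x_hat is the cleaned feature vector (expected value) and x_var its
    per-dimension variance; the model variance is inflated by x_var.
    """
    inflated = [v + u for v, u in zip(var, x_var)]
    return log_gauss_diag(x_hat, mean, inflated)


# Toy example: the second dimension is far from the model mean,
# but the enhancement stage reports it as highly uncertain.
x_hat = [0.0, 3.0]
x_var = [0.0, 8.0]
mean = [0.0, 0.0]
var = [1.0, 1.0]

plug_in = log_gauss_diag(x_hat, mean, var)
with_uncertainty = log_gauss_uncertain(x_hat, x_var, mean, var)
print(plug_in, with_uncertainty)
```

In this toy case the unreliable dimension is penalized less once its variance is taken into account, and with zero feature variance the score reduces to the conventional point-estimate likelihood.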

Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (the spectrogram) that present a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Uncertainty Decoding for Noise Robust Automatic Speech Recognition

This report presents uncertainty decoding as a method for robust automatic speech recognition for the Noise Robust Automatic Speech Recognition project funded by Toshiba Research Europe Limited. The effects of noise on speech recognition are reviewed and a general framework for noise robust speech recognition is introduced. Common and related noise robustness techniques are described in the contex...

Publication date: 2004